Introduction to CNNs and LSTMs for NLP

Author

  • Antoine J.-P. Tixier
Abstract

I put together these notes as part of my TA work for the Graph and Text Mining graduate course of Prof. Michalis Vazirgiannis in the spring of 2017. They accompanied a programming lab session on Convolutional Neural Networks (CNNs) and Long Short Term Memory networks (LSTMs) for document classification, using Python and Keras. Keras is a very popular Python library for deep learning. It is a wrapper for TensorFlow, Theano, and CNTK. To write this handout, I curated information mainly from: the original 2D CNN paper [11] and Stanford’s CS231n CNN course notes; Zhang and Wallace’s practitioners’ guide to CNNs in NLP [18]; the seminal papers on CNNs for text classification [8, 9]; Denny Britz’s tutorial on RNNs; and Chris Olah’s post on understanding LSTMs. Last but not least, Yoav Goldberg’s primer on neural networks for NLP [5] proved very useful in understanding both CNNs and RNNs. The CNN part of the code can be found on my GitHub.

Similar resources

Fast and Accurate Entity Recognition with Iterated Dilated Convolutions

Today, when many practitioners run basic NLP on the entire web and large-volume traffic, faster methods are paramount to saving time and energy costs. Recent advances in GPU hardware have led to the emergence of bi-directional LSTMs as a standard method for obtaining per-token vector representations serving as input to labeling tasks such as NER (often followed by prediction in a linear-chain CRF...

Fast and Accurate Sequence Labeling with Iterated Dilated Convolutions

Today, when many practitioners run basic NLP on the entire web and large-volume traffic, faster methods are paramount to saving time and energy costs. Recent advances in GPU hardware have led to the emergence of bi-directional LSTMs as a standard method for obtaining per-token vector representations serving as input to labeling tasks such as NER (often followed by prediction in a linear-chain CRF...

BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with CNNs and LSTMs

In this paper, we describe our attempt at producing a state-of-the-art Twitter sentiment classifier using Convolutional Neural Networks (CNNs) and Long Short Term Memory (LSTMs) networks. Our system leverages a large amount of unlabeled data to pre-train word embeddings. We then use a subset of the unlabeled data to fine-tune the embeddings using distant supervision. The final CNNs and LSTMs are...

Duplicate Question Pair Detection with Deep Learning

Determining whether two questions are asking the same thing can be challenging, as word choice and sentence structure can vary significantly. Traditional natural language processing techniques such as shingling have been found to have limited success in separating related questions from duplicate questions. Using a dataset of 400,000 labeled question pairs provided by question-and-answer forum Q...

Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks

Various neural network architectures have been proposed in the literature to model 2D correlations in the input signal, including convolutional layers, frequency LSTMs and 2D LSTMs such as time-frequency LSTMs, grid LSTMs and ReNet LSTMs. It has been argued that frequency LSTMs can model translational variations similar to CNNs, and 2D LSTMs can model even more variations [1], but no proper com...

Publication date: 2017